Applying new computing paradigms like quantum computing to the field of machine learning has recently
gained attention. However, as high-dimensional real-world applications cannot yet be solved using
purely quantum hardware, hybrid methods using both classical and quantum machine learning paradigms
have been proposed. For instance, transfer learning methods have been shown to be successfully applicable 
to hybrid image classification tasks. Nevertheless, beneficial circuit architectures still need to be explored. 
Therefore, tracing the impact of the chosen circuit architecture and parameterization is crucial for the devel- 
opment of beneficially applicable hybrid methods. However, current methods include processes where both 
parts are trained concurrently, therefore not allowing for a strict separability of classical and quantum impact. 
Thus, those architectures might produce models that yield a superior prediction accuracy whilst employing the 
least possible quantum impact. To tackle this issue, we propose Sequential Quantum Enhanced Training
(SEQUENT), an improved architecture and training process for the traceable application of quantum
computing methods to hybrid machine learning. Furthermore, we provide formal evidence for the
disadvantage of current methods and preliminary experimental results as a proof-of-concept for the
applicability of SEQUENT.


1 INTRODUCTION 


With classical computation evolving towards
performance saturation, new computing paradigms like
quantum computing arise, promising superior
performance in complex problem domains. However,
current architectures merely reach around 100 quantum
bits (qubits), which are prone to noise, and classical
computers run out of resources when simulating
similarly sized systems (Preskill, 2018). Thus, most
real-world applications are not yet feasible when
relying solely on quantum compute. Especially in the
field of machine learning, where parameter spaces
sized upwards of 50 million are required for tasks like
image classification, the resources of current quantum
hardware or simulators are not yet sufficient for pure
quantum approaches (He et al., 2016). Therefore,
hybrid approaches have been proposed, where the
power of both classical and quantum computation is
united for improved results (Bergholm et al., 2018).
By this, it is possible to leverage the advantages of
quantum computing for tasks with parameter spaces
that cannot be computed solely by quantum computers
due to hardware and simulation limitations. Within
those hybrid algorithms, the quantum part is, analogous
to the classical deep neural networks (DNNs),
represented by so-called variational quantum circuits
(VQCs), which are parameterized and can be trained in
a supervised manner using labeled data (Cerezo et al.,
2021). For hybrid machine learning, we will from here
on refer to VQCs as quantum parts and to DNNs as
classical parts.

To solve large-scale real-world tasks, like image
classification, the concept of transfer learning has
been applied for training such hybrid models
(Girshick et al., 2014; Pan and Yang, 2010). Given a
complex model with high-dimensional input and
parameter spaces, the term transfer learning classically
refers to the two-step procedure of first pre-training
using a large but generic dataset and secondly
fine-tuning using a smaller but more specific dataset
(Torrey and Shavlik, 2010). Usually, a subset of the
model's weights is frozen for the fine-tuning to
compensate for insufficient amounts of fine-tuning
data.

Applied to hybrid quantum machine learning
(QML), the pre-trained model is used as a feature
extractor and the dense classifier is replaced by a
hybrid model referred to as dressed quantum circuit
(DQC), which includes classical pre- and
post-processing layers and the central VQC (Mari et al.,
2020). This architecture results in concurrent updates
to both classical and quantum weights. Even though
this produces updates towards overall optimal
classification results, it does not allow for tracing the
advantageousness of the quantum part of the
architecture. Thus, although they provide competitive
classification results, such hybrid approaches do not
allow for a valid judgment of whether the chosen
quantum circuit benefits the classification. The only
arguable result is that it does not harm the overall
performance or that the introduced inaccuracies may
be compensated by the classical layers in the end.
However, as we currently are still only exploring
VQCs, this verdict, i.e., traceability of the impact of
both the quantum and the classical part, is crucial to
infer the architecture quality from common metrics.
Overall, with current approaches we find a mismatch
between the goal of exploring viable architectures and
the process applied.

We therefore propose the application of Sequential
Quantum Enhanced Training (SEQUENT), an
adapted architecture and training procedure for hybrid
quantum transfer learning, where the effects of the
classical and quantum parts are separately assessable.
We see our contributions as follows:


• We provide formal evidence that current quantum
transfer learning architectures might result in an
optimal network configuration (perfect classification /
regression results) with the least possible quantum
impact, i.e., a solution equivalent to a purely
classical one.

• We propose SEQUENT, a two-step procedure of
classical pre-training and quantum fine-tuning using
an adapted architecture to reduce the number of
features classically extracted to the number of
features manageable by the VQC producing the
final classification.

• We show competitive results with a traceable
impact of the chosen VQC on the overall
performance using preliminary benchmark datasets.


2 BACKGROUND 


To delimit SEQUENT, the following section pro- 
vides a brief general introduction to the related fields 
of quantum computation, quantum machine learning, 
deep learning and transfer learning. 


2.1 Quantum Computing 


Quantum Computation works fundamentally
differently from classical computation, since QC uses
qubits instead of classical bits. Where a classical bit
can be in the state 0 or 1, the corresponding states of
a qubit are described in Dirac notation as $|0\rangle$
and $|1\rangle$. However, more importantly, qubits can
be in a superposition, i.e., a linear combination of both:

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle \qquad (1)$$

To alter this state, a set of reversible unitary
operations like rotations can be applied sequentially to
individual target qubits or in conjunction with a
control qubit. Upon measurement, the superposition
collapses and the qubit takes on either the state
$|0\rangle$ or $|1\rangle$ according to a probability.
Note that $\alpha$ and $\beta$ in (1) are complex
numbers where $|\alpha|^2$ and $|\beta|^2$ give the
probability of measuring the qubit in state $|0\rangle$
or $|1\rangle$, respectively. Note that
$|\alpha|^2 + |\beta|^2 = 1$, i.e., the probabilities
sum up to 1 (Nielsen and Chuang, 2010).
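
As a minimal illustration of Equation 1, the following sketch computes the measurement probabilities from the two amplitudes and simulates repeated measurements. The function names, the normalization of unnormalized inputs, and the fixed random seed are assumptions made for this example only.

```python
import math
import random


def measurement_probabilities(alpha, beta):
    """Return (P(0), P(1)) for the state alpha|0> + beta|1>."""
    p0, p1 = abs(alpha) ** 2, abs(beta) ** 2
    norm = p0 + p1
    if norm == 0:
        raise ValueError("the zero vector is not a valid qubit state")
    # Normalize so that the probabilities sum to 1 even for unnormalized inputs.
    return p0 / norm, p1 / norm


def measure(alpha, beta, rng):
    """Simulate one measurement: the superposition collapses to 0 or 1."""
    p0, _ = measurement_probabilities(alpha, beta)
    return 0 if rng.random() < p0 else 1


# Balanced superposition: alpha = beta = 1 / sqrt(2).
amp = 1 / math.sqrt(2)
p0, p1 = measurement_probabilities(amp, amp)
rng = random.Random(0)  # fixed seed, chosen only for reproducibility
samples = [measure(amp, amp, rng) for _ in range(1000)]
```

For the balanced state both outcomes are equally likely, so roughly half of the simulated measurements yield 1.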

Quantum algorithms like Grover (Grover, 1996)
or Shor (Shor, 1994) provide a theoretical speedup
compared to classical algorithms. Moreover, in 2019
quantum supremacy was claimed (Arute et al., 2019),
and the race to find more algorithms providing a
quantum advantage is currently underway. However,
the current state of quantum computing is often
referred to as the noisy intermediate-scale quantum
(NISQ) era (Preskill, 2018), a period in which
relatively small and noisy quantum computers are
available, but no error correction to mitigate the noise,
limiting the execution to small quantum circuits.
Furthermore, current quantum computers are not yet
capable of executing algorithms that provide any
quantum advantage in a practically useful setting.
Thus, much research has recently been put into the
investigation of hybrid classical-quantum algorithms,
that is, algorithms that consist of quantum and
classical parts, each responsible for a distinct task. In
this regard, quantum machine learning has been
gaining in popularity.


Quantum Machine Learning algorithms have
been proposed in several varieties over the last years
(Farhi et al., 2014; Dong et al., 2008; Biamonte et al.,
2017). Besides quantum kernel methods (Schuld
and Killoran, 2019), variational quantum algorithms
(VQAs) seem to be the most relevant in the current
NISQ era for various reasons (Cerezo et al., 2021).

VQAs generally comprise multiple components,
but the central part is the structure of the applied
circuit or Ansatz. Furthermore, a VQA Ansatz
is intrinsically parameterized in order to use it as a 
predictive model by optimizing the parameterization 
towards a given objective, i.e. to minimize a given 
loss. Overall, given a set of data and targets, a param- 
eterized circuit and an objective, an approximation 
of the generator underlying the data can be learned. 
Applying methods like gradient descent, this model 
can be trained to predict the label of unseen data 
(Cerezo et al., 2021; Mitarai et al., 2018). For the 
field of QML, various circuit architectures have been 
proposed (Biamonte et al., 2017; Khairy et al., 2020; 
Schuld et al., 2020). 

For the remainder of this paper, we consider the
following simple $\phi$-parameterized variational
quantum circuit (VQC) for $\eta$ qubits:

$$\mathrm{VQC}_\phi(z) = \mathrm{measure}_{\eta \to o} \circ \mathrm{entangle}_{\phi_\delta} \circ \dots \circ \mathrm{entangle}_{\phi_1} \circ \mathrm{embed}_\eta(z) \qquad (2)$$

with the depth $\delta$ and the output dimension $o$,
given the input $z = (z_1, \dots, z_\eta)$, where
$\mathrm{embed}_\eta$ loads the data points $z$ into
$\eta$ balanced qubits in superposition via z-rotations,
$\mathrm{entangle}_\phi$ applies controlled-NOT gates
to entangle neighboring qubits followed by
$\phi$-parameterized z-rotations, and
$\mathrm{measure}_{\eta \to o}$ applies the Pauli-Z
operator and measures the first $o$ qubits
(Schuld and Killoran, 2019; Mitarai et al., 2018).

This architecture has also been shown to be di- 
rectly applicable to classification tasks, using the 
measurement expectation value as a one-hot encoded 
prediction of the target (Schuld et al., 2020). 

Overall, VQAs have been shown to be applica- 
ble to a wide variety of classification tasks (Abo- 
hashima et al., 2020) and successfully utilized by 
Mari et al. (2020), using the simple architecture de- 
fined in (2). Thus, to provide a proof-of-concept for 
SEQUENT, we will focus on said architecture for 
classification tasks and leave the optimization of em- 
beddings (LaRose and Coyle, 2020) and architectures 
(Khairy et al., 2020) to future research. 


2.2 Deep Learning 


Deep Neural Networks (DNNs) refer to
parameterized networks consisting of a set of fully
connected layers. A layer comprises a set of distinct
neurons, where each neuron takes a vector of inputs
$x = (x_1, x_2, \dots, x_n)$, which is multiplied with
the corresponding weight vector
$w_j = (w_{j1}, w_{j2}, \dots, w_{jn})$. A bias $b_j$ is
added before the result is passed into an activation
function $\sigma$. Therefore, the output of neuron $z$
at position $j$ takes the following form (Bishop and
Nasrabadi, 2006):

$$z_j = \sigma\left(\sum_{i=1}^{n} w_{ji} x_i + b_j\right) \qquad (3)$$

Given a target function $f(x): X \mapsto y$, we can
define the approximate

$$\hat{f}_\theta(x): X \mapsto \hat{y} = L_{h \to o} \circ \dots \circ L_{n \to h} \qquad (4)$$

as a composition of multiple layers $L$ with multiple
neurons $z$ parameterized by $\theta$, $d - 1$
$h$-dimensional hidden layers, and the respective input
and target dimensions $n$ and $o$. Using the prediction
error $J = (y - \hat{f}_\theta(x))^2$, $\hat{f}_\theta$
can be optimized by propagating the error backwards
through the network using the gradient
$\nabla_\theta J$ (Bishop and Nasrabadi, 2006). Those
feed-forward models have been shown capable of
approximating arbitrary functions, given a sufficient
amount of data and either a sufficient depth (i.e.
number of hidden layers) or width (i.e. size of hidden
state) (Leshno et al., 1993). Deep neural networks for
image classification tasks comprise two parts: a
feature extractor containing a composite of
convolutional layers to extract a $v$-sized vector of
features, $FE: X \mapsto v$, and a composite of fully
connected layers to classify the extracted feature
vector, $FC: v \mapsto \hat{y}$. Thus, the overall
model is defined as
$\hat{f}: X \mapsto \hat{y} = FC_\theta \circ FE_\theta(x)$.
Those models have been successfully applied to a wide
variety of real-world classification tasks (He et al.,
2016; Krizhevsky et al., 2012). However, to find a
parameterization that optimally separates the given
dataset, a large amount of training data is required.
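
As a minimal sketch of Equations 3 and 4, the following code evaluates a single fully connected layer and composes several layers into a feed-forward model. The choice of tanh as activation function, the function names, and all weight values are illustrative assumptions and are not taken from the experiments.

```python
import math


def layer(x, weights, biases, activation=math.tanh):
    """One fully connected layer: z_j = activation(sum_i w_ji * x_i + b_j)."""
    return [
        activation(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + b_j)
        for w_j, b_j in zip(weights, biases)
    ]


def feed_forward(x, layers):
    """Compose layers as in Equation 4; `layers` is ordered from input to output."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x


def squared_error(y, y_hat):
    """Prediction error J = (y - f(x))^2, summed over the output dimensions."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))


# n = 2 inputs, h = 3 hidden neurons, o = 1 output (illustrative values).
hidden = ([[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]], [0.0, 0.1, -0.1])
output = ([[1.0, -1.0, 0.5]], [0.2])
prediction = feed_forward([1.0, 2.0], [hidden, output])
loss = squared_error([0.0], prediction)
```

Training would then adjust the weights and biases along the negative gradient of `loss`, which is omitted here for brevity.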


Transfer Learning aims to solve the problem of
insufficient training data by transferring already
learned knowledge (weights, biases) from a task $T_s$
of a source domain $D_s$ to a related target task $T_t$
of a target domain $D_t$. More specifically, a domain
$D = \{X, P(x)\}$ comprises a feature space $X$ and
the probability distribution $P(x)$, where
$x = (x_1, x_2, \dots, x_n) \in X$. The corresponding
task $T$ is given by $T = \{y, f(x)\}$ with label space
$y$ and target function $f(x)$ (Zhuang et al., 2021).
A deep transfer learning task is defined by
$(D_s, T_s, D_t, T_t, f_t(\cdot))$, where $f_t(\cdot)$ is
defined according to Equation 4 (Tan et al., 2018).
Generally, transfer learning is a two-stage process.
Initially, a source model is trained according to a
specific task $T_s$ in the source domain $D_s$.
Subsequently, transfer learning aims to enhance the
performance of the target predictive function
$f_t(\cdot)$ for the target learning task $T_t$ in the
target domain $D_t$ by transferring latent knowledge
from $T_s$ in $D_s$, where $D_s \neq D_t$ and/or
$T_s \neq T_t$. Usually, the size of $D_s \gg D_t$
(Tan et al., 2018). The knowledge transfer and
learning step is commonly achieved via feature
extraction and/or fine-tuning.

The feature extraction process freezes the source
model and adds a new classifier to the output of the
pre-trained model. Thereby, the feature maps learned
from $T_s$ in $D_s$ can be repurposed and the newly
added classifier is trained according to the target task
$T_t$ (Donahue et al., 2014). The fine-tuning process
additionally unfreezes top layers from the source
model and jointly trains the unfrozen feature
representations from the source model with the added
classifier. By this, the time and space complexity for
the target task $T_t$ can be reduced by transferring
and/or fine-tuning the already learned features of a
pre-trained source model to a target model (Girshick
et al., 2014).


3 RELATED WORK 


In the context of machine learning, VQAs are of- 
ten applied to the problem of classification (Schuld 
et al., 2020; Mitarai et al., 2018; Havlíček et al., 2019; 
Schuld and Killoran, 2019), although other applica- 
tion areas exist. Different techniques, e.g. embed- 
ding (Lloyd et al., 2020; LaRose and Coyle, 2020), 
or problems, e.g. barren plateaus (McClean et al., 
2018), have been widely discussed in the QML liter- 
ature. However, we focus on hybrid quantum transfer 
learning (Mari et al., 2020) in this paper. 

Classical Transfer Learning is widely applied in
present-day machine learning algorithms (Torrey and
Shavlik, 2010; Pan and Yang, 2010; Pratt, 1992) and
can be extended with concepts of the emerging
quantum computing technology (Zen et al., 2020).
Mari et al. (2020) propose various hybrid transfer
learning architectures ranging from classical to
quantum (CQ), quantum to classical (QC) and
quantum to quantum (QQ). The authors focus on the
CQ architecture, which comprises the previously
explained DQC. In the current era of intermediate-scale
quantum technology, the DQC transfer learning
approach is the most widely investigated and applied
one, as it allows high-dimensional data to be
pre-processed, to some extent optimally, before the
most relevant features are loaded into a quantum
computer. Gokhale et al. (2020) used this architecture
to classify and detect image splicing forgeries, while
Acar and Yilmaz (2021) applied it to detect COVID-19
from CT images. Also, Mari et al. (2020) assess their
approach by way of example on image classification
tasks. Although the results are quite promising, it is
not clear from the evaluation whether the dressed
quantum circuit is advantageous over a fully classical
approach.


4 DQC QUANTUM IMPACT 


We argue that for certain problem instances DQCs
may yield accurate results while not making active
use of any quantum effects in the VQC. This
possibility exists especially for easy-to-solve problem
instances, when the purely classical layers alone are
sufficient to yield accurate results and the quantum
layer represents the identity. This can be seen by
realizing that the classical pre-processing layer acts as
a hidden layer with a non-polynomial activation
function, hence being capable of approximating
arbitrary continuous functions depending on the
number of hidden units by the universal approximation
theorem (Leshno et al., 1993). The overall DQC
architecture is portrayed in Figure 1.

The central VQC is defined according to
subsection 2.1. Both pre- and post-processing layers
are implemented by fully connected layers of neurons
with a non-linear activation function according to
subsection 2.2. Formally, the DQC for $\eta$ qubits
can thus be depicted as:

$$\mathrm{DQC} = L_{\eta \to o} \circ \mathrm{VQC}_\phi \circ L_{n \to \eta} \qquad (5)$$

where $L_{n \to \eta}$ and $L_{\eta \to o}$ are the
fully connected classical dressing layers according to
Equation 3, mapping from the input size $n$ to the
number of qubits $\eta$ and from the number of qubits
$\eta$ to the target size $o$, respectively, and
$\mathrm{VQC}_\phi$ is the actual variational quantum
circuit according to Equation 2 with $\eta$ qubits and
$o = \eta$ measured outputs.

Now let us consider a parameterization $\phi$ where
$\mathrm{VQC}_\phi(z) = \mathrm{id}(z) = z$ resembles
the identity function. Consequently, (5) collapses into
the following purely classical, 2-layer feed-forward
network with the hidden dimension $\eta$:

$$\mathrm{DQC} = L_{\eta \to o} \circ \mathrm{id} \circ L_{n \to \eta} = L_{\eta \to o} \circ L_{n \to \eta} \qquad (6)$$

By the universal function approximation theorem,
this allows the DQC to approximate any polynomial
function $f: \mathbb{R}^n \to \mathbb{R}^o$ of degree 1
arbitrarily well, even if the VQC is not affecting the
prediction at all.
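
The collapse described by Equation 6 can be sketched in a few lines. In the following example, `identity_vqc` is a hypothetical stand-in for a VQC whose parameterization makes it act as the identity; the tanh activation and all weight values are assumptions chosen only for illustration.

```python
import math


def dense(x, weights, biases):
    """Fully connected layer with tanh activation (cf. Equation 3)."""
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def identity_vqc(z):
    """Stand-in for a VQC whose parameterization makes it act as the identity."""
    return list(z)


def dqc(x, pre, post, vqc):
    """Post-processing layer applied to VQC applied to pre-processing layer (Equation 5)."""
    return dense(vqc(dense(x, *pre)), *post)


def classical_two_layer(x, pre, post):
    """Two-layer feed-forward network without any quantum part (Equation 6)."""
    return dense(dense(x, *pre), *post)


# Hypothetical weights: n = 2 inputs, eta = 2 qubits, o = 1 output.
pre = ([[0.7, -0.3], [0.2, 0.9]], [0.1, -0.2])
post = ([[1.5, -0.5]], [0.0])
x = [0.4, -1.2]

# With an identity VQC, the DQC output equals the purely classical output.
same = dqc(x, pre, post, identity_vqc) == classical_two_layer(x, pre, post)
```

Since both paths perform the same floating-point operations, `same` evaluates to `True`: the prediction alone does not reveal whether a quantum part contributed.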

Consequently, one has to be careful in the
selection of suitable problem instances, as they must
not be too easy in order to ensure that the VQC is even
needed to yield the desired results. This becomes
especially difficult as current quantum hardware is
quite limited, typically restricting the choice to fairly
easy problem instances. On top of this, no necessity to
use a post-processing layer seems apparent, as it has
been shown in various publications (Schuld et al.,
2020; Schuld and Killoran, 2019) that variational
quantum classifiers, i.e., VQCs, can successfully
complete classification tasks without any
post-processing. Overall, whilst conveying a
proof-of-concept that the combination of classical
neural networks and variational quantum circuits in
the dressed quantum circuit hybrid architecture is able
to produce competitive results, this architecture is
neither able to convey the advantageousness of the
chosen quantum circuit nor to exclude the possibility
of the classical part merely compensating for quantum
unsteadiness.


Figure 1: Dressed Quantum Circuit Architecture


5 SEQUENT 


To improve the traceability of quantum impact in
hybrid architectures, we propose Sequential Quantum
Enhanced Training. SEQUENT improves upon the
dressed quantum circuit architecture by introducing
two adaptations to it: First, we omit the classical
post-processing layer and use the variational quantum
circuit output directly as the classification result.
Therefore, we reduce the measured outputs $o$ from
the number of qubits $\eta$ (cf. Figure 1) to the
dimension of the target $\hat{y}$ (cf. Figure 2).

The direct use of VQCs as classifiers has been
frequently proposed and shown to be as applicable as
classical counterparts (Schuld et al., 2020). By this,
the overall quality of the chosen circuit and
parameterization is directly assessable by the
classification result, i.e., the final accuracy. Moreover,
a parameter setting of universal approximation
capabilities (cf. Equation 6) with the least (identity)
quantum contribution is mathematically precluded by
the removal of the hidden state (cf. Equation 5).

Concurrently omitting the pre-processing or
compression layer, however, would increase the
minimum number of required qubits to the number of
output features of the problem domain, or, when
applied to image classification, of the chosen feature
extractor (e.g. 512 for Resnet-18). However, neither
current quantum hardware nor simulators allow for
arbitrarily sized circuits, currently maxing out at
around 100 qubits.


Figure 2: SEQUENT Architecture: Sequential Quantum
Enhanced Training comprised of a classical compression
layer (CCL) parameterized by $\theta$ and a variational
quantum circuit (VQC) parameterized by $\phi$ with
separate phases for classical (blue) and quantum (green)
training for variable sets of input data $X$, prediction
targets $\hat{y}$ and VQCs with $\eta$ qubits and
$\delta$ entangling layers.


We therefore secondly propose to maintain the
classical compression layer to provide a
mapping/compression $X \mapsto \eta$ and, in order to
fully classically pre-train the compression layer, add a
surrogate classical classification layer
$\eta \mapsto \hat{y}$. Replacing this surrogate
classical classification layer with the chosen
variational quantum circuit to be assessed and freezing
the pre-trained weights of the classical compression
layer then allows for a second, purely quantum
training phase and yields the following sequential
training procedure depicted in Figure 3:


1. Pre-train SEQUENT:
$\hat{f}: X \mapsto \eta \mapsto \hat{y} = \mathrm{CCL}_\theta(z) \circ \mathrm{CCL}_\theta(x)$,
containing a classical compression layer and a
surrogate classification layer, by optimizing the
classical weights $\theta$.


2. Freeze the classical weights $\theta$, replace the
surrogate classical classification layer by the
variational quantum classification circuit
$\mathrm{VQC}_\phi$ (cf. Equation 2) and optimize the
quantum weights $\phi$.


This two-step procedure can be seen as an applica- 
tion of transfer learning on its own, transferring from 
classical to quantum weights in a hybrid architecture. 

Overall, the SEQUENT architecture displayed in 
Figure 2 can be formalized as: 


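$$\mathrm{SEQUENT}_{\theta,\phi}: X \mapsto \eta \mapsto \hat{y} = \mathrm{VQC}_\phi(z) \circ \mathrm{CCL}_\theta(x) \qquad (7)$$
$$\mathrm{CCL}_\theta(x): X \mapsto \eta = L_{n \to \eta} \qquad \text{(cf. Equation 3)}$$
$$\mathrm{VQC}_\phi(z): \eta \mapsto \hat{y} \qquad \text{(cf. Equation 2)}$$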
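
The two training phases can be sketched with a deliberately tiny toy model. The following code is not the implementation used for the experiments: it assumes a single feature ($\eta = 1$), a hypothetical single-qubit stand-in circuit using RY rotations (whose Pauli-Z expectation is $\cos(z + \phi)$) instead of the circuit from Equation 2, a tanh surrogate head, and plain gradient descent with numerical gradients. All names and values are illustrative.

```python
import math


def ccl(x, theta):
    """Classical compression layer X -> eta (here eta = 1), cf. Equation 3."""
    return math.tanh(theta["w"] * x + theta["b"])


def surrogate_head(z, theta):
    """Surrogate classical classification layer, used only during pre-training."""
    return math.tanh(theta["v"] * z)


def vqc_stand_in(z, phi):
    """Assumed circuit RY(phi) RY(z)|0>; its Pauli-Z expectation is cos(z + phi)."""
    return math.cos(z + phi["phi"])


def mse(model, data):
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)


def train(params, trainable, loss, steps=200, lr=0.1, eps=1e-5):
    """Gradient descent with central differences; only `trainable` keys change."""
    for _ in range(steps):
        grads = {}
        for key in trainable:
            original = params[key]
            params[key] = original + eps
            up = loss()
            params[key] = original - eps
            down = loss()
            params[key] = original  # restore before estimating the next gradient
            grads[key] = (up - down) / (2 * eps)
        for key in trainable:
            params[key] -= lr * grads[key]
    return params


data = [(-2.0, -1.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 1.0)]  # toy 1-D task
theta = {"w": 0.1, "b": 0.0, "v": 0.1}  # classical weights
phi = {"phi": 0.0}                      # quantum weight


def pretrain_loss():
    return mse(lambda x: surrogate_head(ccl(x, theta), theta), data)


def hybrid_loss():
    return mse(lambda x: vqc_stand_in(ccl(x, theta), phi), data)


# Phase 1: classical pre-training of compression layer and surrogate head.
train(theta, ["w", "b", "v"], pretrain_loss)
frozen = dict(theta)

# Phase 2: theta stays frozen, only the quantum weight phi is optimized.
loss_before = hybrid_loss()
train(phi, ["phi"], hybrid_loss)
loss_after = hybrid_loss()
```

Because the second phase only updates `phi`, any change between `loss_before` and `loss_after` is attributable to the quantum part, which is the kind of traceability the sequential procedure aims for.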




Figure 3: SEQUENT Training Process consisting of 
a classical (blue) pre-training phase (1) and a quantum 
(green) fine-tuning phase (2). 


To be used for the classification of
high-dimensional data, like images, the input $x$ needs
to be replaced by the intermediate output $z$ of an
image recognition model (cf. subsection 2.2).
Combining both two-step transfer learning procedures
yields the following three-step procedure:


1. Classically pre-train a full classification model
(e.g. Resnet (He et al., 2016))
$\hat{f}: X \mapsto v \mapsto \hat{y} = FC_\theta(z) \circ FE_\theta(x)$
on a large generic dataset (cf. subsection 2.2).


2. Freeze the convolutional feature extraction layers
$FE$ and fine-tune the fully connected layers consisting
of a compression layer and a surrogate classification
layer,
$FC: v \mapsto \eta \mapsto \hat{y} = \mathrm{CCL}_\theta(z) \circ \mathrm{CCL}_\theta(x)$.


3. Freeze the classical weights and replace the
surrogate classification layer with the VQC to train the
quantum weights $\phi$ of the hybrid model:
$\hat{f}_{\theta,\phi}: X \mapsto v \mapsto \eta \mapsto \hat{y} = \mathrm{VQC}_\phi(z) \circ \mathrm{CCL}_\theta(x) \circ FE$


For a classification task with $n$ classes, at least
$\eta \geq n$ qubits are required. Whilst we use the
simple Ansatz introduced in Equation 2 with
$\eta = 6$ qubits and a circuit depth of $\delta = 10$
to validate our approach in the following, any VQC
architecture yielding a direct classification result
would be conceivable.


6 EVALUATION 


We evaluate SEQUENT by comparing its
performance to its predecessor, the DQC, and a purely
classical feed-forward neural network. All models
were trained on 2000 datapoints of the moons and
spirals (Lang and Witbrock, 1988) benchmark datasets
for two and four epochs of sequential, hybrid and
classical training respectively. To guarantee
comparability, we set the size of the hidden state of the
classical model to $h = \eta = 6$. The code for all
experiments is available here¹. The classification
results are visualized in Figure 4. Looking at the
results for the moons dataset, all compared models are
able to depict the shape underlying the data. Note that
even the considerably simpler classical model is
perfectly able to separate the given classes. Hence,
these experimental results support the concerns about
the impact of the VQC on the overall DQC's
performance (cf. section 4). With a final test accuracy
of 95%, the DQC performs even worse than the purely
classical model reaching 96%. Looking at the
SEQUENT results, however, these concerns are
eliminated, as the performance and final accuracy of
97%, besides outperforming both compared models,
can certainly be attributed to the VQC, due to the
applied training process and the used architecture.
Similar results are observed for the second benchmark
dataset of intertwined spirals on the right side of
Figure 4. The overall best accuracy of 86%, however,
suggests that further adjustments to the VQC could be
beneficial. This result also depicts the application of
SEQUENT we imagine for benchmarking and
optimizing VQC architectures.


Figure 4: Classification Results of SEQUENT, DQC and
Classical Feed Forward Neural Network for moons (left)
and spirals (right) benchmark datasets


7 CONCLUSIONS 


We proposed Sequential Quantum Enhanced
Training (SEQUENT), a two-step transfer learning
procedure applied to training hybrid QML algorithms,
combined with an adapted hybrid architecture to allow
for tracing both the classical and quantum impact on
the overall performance. Furthermore, we showed
the need for said adaptations by formally pointing out
weaknesses of the DQC, the current state-of-the-art
approach in this regard. Finally, we showed that
SEQUENT yields competitive results for two
representative benchmark datasets compared to DQCs
and classical neural networks. Thus, we provided a
proof-of-concept for both the proposed reduced
architecture and the adapted transfer learning training
procedure.

However, whilst SEQUENT theoretically is
applicable to any kind of VQC, we only considered the
simple architecture with fixed angle embeddings and
$\delta$ entangling layers as proposed by Mari et al.
(2020). Furthermore, we only supplied preliminary
experimental implications and did not yet test any
high-dimensional real-world applications. Overall, we
do not expect superior results that outperform
state-of-the-art approaches in the first place, as viable
circuit architectures for quantum machine learning are
still an active and fast-moving field of research.

Thus, both the real-world applicability and the
development of circuit architectures that indeed offer
a benefit over classical ones deserve further research
attention. To empower real-world applications, the
use of hybrid quantum methods should also be kept in
mind when pre-training large classification models
like Resnet. Also, applying more advanced techniques
to train the pre-processing or compression layer to
take full advantage of the chosen quantum circuit
should be examined. To this end, auto-encoder
architectures might be applicable to train a more
generalized mapping from the classical input space to
the quantum space. Overall, we believe that by
applying the proposed concepts and building upon
SEQUENT, both valuable hybrid applications and
beneficial quantum circuit architectures can be found.